 game log


Collaborative Quest Completion with LLM-driven Non-Player Characters in Minecraft

arXiv.org Artificial Intelligence

The use of generative AI in video game development is on the rise, and as the conversational and other capabilities of large language models continue to improve, we expect LLM-driven non-player characters (NPCs) to become widely deployed. In this paper, we seek to understand how human players collaborate with LLM-driven NPCs to accomplish in-game goals. We design a minigame within Minecraft where a player works with two GPT4-driven NPCs to complete a quest. We perform a user study in which 28 Minecraft players play this minigame and share their feedback. On analyzing the game logs and recordings, we find that several patterns of collaborative behavior emerge from the NPCs and the human players. We also report on the current limitations of language-only models that do not have rich game-state or visual understanding. We believe that this preliminary study and analysis will inform future game developers on how to better exploit these rapidly improving generative AI models for collaborative roles in games.


Automatic Bug Detection in LLM-Powered Text-Based Games Using LLMs

arXiv.org Artificial Intelligence

Advancements in large language models (LLMs) are revolutionizing interactive game design, enabling dynamic plotlines and interactions between players and non-player characters (NPCs). However, LLMs may exhibit flaws such as hallucinations, forgetfulness, or misinterpretations of prompts, causing logical inconsistencies and unexpected deviations from intended designs. Automated techniques for detecting such game bugs are still lacking. To address this, we propose a systematic LLM-based method for automatically identifying such bugs from player game logs, eliminating the need for collecting additional data such as post-play surveys. Applied to a text-based game DejaBoom!, our approach effectively identifies bugs inherent in LLM-powered interactive games, surpassing unstructured LLM-powered bug-catching methods and filling the gap in automated detection of logical and design flaws.


Dominion: A New Frontier for AI Research

arXiv.org Artificial Intelligence

Games have long played a role in AI research, both as a test-bed, and as a moving goal-post, constantly driving innovation. From the heyday of chess agents, when Deep Blue beat Garry Kasparov, to more recent advances, like AlphaGo's dark horse ascent to fame, games have both assisted AI research and provided something to aim for. As the AIs got better, the games they were applied to also got more complex. New game mechanics, such as the fog of war in StarCraft and the stochasticity of Poker, pushed researchers to adapt their methods to ever greater generality. In this paper, we argue that the deck-building strategy game Dominion [1] deserves to join the ranks of AI benchmark games, providing an RL-based bot in service of that benchmark. Dominion has all of the abovementioned elements, but it also incorporates a mechanic that is not present in other popular RL benchmarks: every game is played with a different set of cards. Since each Dominion card has a specific rule printed on it, and the set of 10 cards for a game is randomly picked from among hundreds of cards, no two games of Dominion can be approached the same way. Thus a key part of playing Dominion is adapting one's inductive bias of how to play to the specific cards on the table.
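The per-game card selection described above can be sketched as a seeded random draw. This is a minimal illustration only: `CARD_POOL` is a small hypothetical pool of base-set card names (the real game draws from hundreds), and `draw_kingdom` is an assumed helper name, not part of the paper's bot.

```python
import random

# Small illustrative pool of card names; an assumption for this sketch,
# not the full card list used in the paper.
CARD_POOL = [
    "Cellar", "Chapel", "Moat", "Village", "Workshop", "Smithy", "Militia",
    "Market", "Mine", "Laboratory", "Festival", "Witch", "Library", "Remodel",
]


def draw_kingdom(pool, size=10, seed=None):
    """Pick `size` distinct cards from `pool` for one game."""
    rng = random.Random(seed)  # a local RNG keeps the draw reproducible
    return sorted(rng.sample(pool, size))


kingdom = draw_kingdom(CARD_POOL, seed=0)
print(kingdom)
```

Because the kingdom changes every game, an agent cannot rely on a fixed opening; it has to condition its strategy on the drawn set.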


Tachikuma: Understanding Complex Interactions with Multi-Character and Novel Objects by Large Language Models

arXiv.org Artificial Intelligence

Recent advancements in natural language and Large Language Models (LLMs) have enabled AI agents to simulate human-like interactions within virtual worlds. However, these interactions still face limitations in complexity and flexibility, particularly in scenarios involving multiple characters and novel objects. Pre-defining all interactable objects in the agent's world model presents challenges, and conveying implicit intentions to multiple characters through complex interactions remains difficult. To address these issues, we propose integrating virtual Game Masters (GMs) into the agent's world model, drawing inspiration from Tabletop Role-Playing Games (TRPGs). GMs play a crucial role in overseeing information, estimating players' intentions, providing environment descriptions, and offering feedback, compensating for current world model deficiencies. To facilitate future explorations for complex interactions, we introduce a benchmark named Tachikuma, comprising a Multiple character and novel Object based interaction Estimation (MOE) task and a supporting dataset. MOE challenges models to understand characters' intentions and accurately determine their actions within intricate contexts involving multi-character and novel object interactions. Besides, the dataset captures log data from real-time communications during gameplay, providing diverse, grounded, and complex interactions for further explorations. Finally, we present a simple prompting baseline and evaluate its performance, demonstrating its effectiveness in enhancing interaction understanding. We hope that our dataset and task will inspire further research in complex interactions with natural language, fostering the development of more advanced AI agents.


Playing the Werewolf game with artificial intelligence for language understanding

arXiv.org Artificial Intelligence

The Werewolf game is a social deduction game based on free natural language communication, in which players try to deceive others in order to survive. An important feature of this game is that a large portion of the conversations are false information, and the behavior of artificial intelligence (AI) in such a situation has not been widely investigated. The purpose of this study is to develop an AI agent that can play Werewolf through natural language conversations. First, we collected game logs from 15 human players. Next, we fine-tuned a Transformer-based pretrained language model to construct a value network that can predict a posterior probability of winning a game at any given phase of the game and given a candidate for the next action. We then developed an AI agent that can interact with humans and choose the best voting target on the basis of its probability from the value network. Lastly, we evaluated the performance of the agent by having it actually play the game with human players. We found that our AI agent, Deep Wolf, could play Werewolf as competitively as average human players in a villager or a betrayer role, whereas Deep Wolf was inferior to human players in a werewolf or a seer role. These results suggest that current language models have the capability to suspect what others are saying, tell a lie, or detect lies in conversations.
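The voting step described above, choosing the target with the highest predicted win probability, amounts to an argmax over candidate actions. The sketch below assumes a callable `win_probability` standing in for the fine-tuned value network; the player names and scores are made up for illustration and are not results from the study.

```python
from typing import Callable, Sequence


def choose_vote_target(
    candidates: Sequence[str],
    win_probability: Callable[[str], float],
) -> str:
    """Return the candidate whose vote has the highest predicted win probability."""
    if not candidates:
        raise ValueError("no candidates to vote for")
    return max(candidates, key=win_probability)


# Hypothetical stand-in for the value network: fixed scores per voting target.
toy_scores = {"Alice": 0.41, "Bob": 0.63, "Carol": 0.55}
best = choose_vote_target(list(toy_scores), toy_scores.__getitem__)
print(best)  # Bob
```

In the real agent the scoring function would depend on the conversation so far and the current game phase, so it would be re-evaluated at every vote.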


Operationalising the data puddle

#artificialintelligence

I've put together a list of the data I want to record and analyse. I've also put together a checklist of the things I'll need to run the D&D campaign that will actually be generating all that beautiful data. Now I need to start operationalising this bad boy. First up, how am I actually going to export the data from all the sources I've identified? I'd rather not be messing around with data scraping, so would prefer (where possible) any tools I use to natively export to convenient file formats.


Player Experience Extraction from Gameplay Video

arXiv.org Artificial Intelligence

The ability to extract the sequence of game events for a given player's play-through has traditionally required access to the game's engine or source code. This serves as a barrier to researchers, developers, and hobbyists who might otherwise benefit from these game logs. In this paper we present two approaches to derive game logs from game video via convolutional neural networks and transfer learning. We evaluate the approaches in a Super Mario Bros. clone, Mega Man and Skyrim. Our results demonstrate our approach outperforms random forest and other transfer baselines.


Help us augment live streams of your game! For science!

@machinelearnbot

Every year our HCI research group defines a number of Master's thesis topics for Computer Science and Engineering students. Our topics include data visualisation, recommender systems, augmented reality, learning analytics, and e-health. As my expertise is data visualisation, and video games have always held a special place in my heart, it only made sense to merge the two. That's why this year we have two Master's students working on "Designing live data visualisations for the new spectator sport: video games". The general idea is to use live (interactive) visualisations during a game to help a specific audience get a better understanding of, and new insights into, what is going on.


Decision Generalisation from Game Logs in No Limit Texas Hold'em

AAAI Conferences

Given a set of data, recorded by observing the decisions of an expert player, we present a case-based framework that allows the successful generalisation of those decisions in the game of no limit Texas Hold'em. We address the problems of determining a suitable action abstraction and the resulting state translation that is required to map real-value bet amounts into a discrete set of abstract actions. We also detail the similarity metrics used in order to identify similar scenarios, without which no generalisation of playing decisions would be possible. We show that we were able to successfully generalise no limit betting decisions from recorded data via our agent, SartreNL, which achieved a 5th place finish out of 11 opponents at the 2012 Annual Computer Poker Competition.
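The state translation mentioned above, mapping a real-valued bet onto a discrete abstract action, can be illustrated with a nearest-pot-fraction rule. The abstract does not specify SartreNL's actual abstraction, so the action set `ABSTRACT_BETS` and the function `translate_bet` below are assumptions chosen only to show the idea.

```python
# Hypothetical action abstraction: bet sizes expressed as fractions of the pot.
ABSTRACT_BETS = {"half_pot": 0.5, "pot": 1.0, "double_pot": 2.0}


def translate_bet(bet: float, pot: float, stack: float) -> str:
    """Map a real-valued bet to the closest abstract action."""
    if pot <= 0:
        raise ValueError("pot must be positive")
    if bet >= stack:
        return "all_in"  # betting the whole stack is its own abstract action
    ratio = bet / pot
    # Pick the abstract bet whose pot fraction is closest to the observed ratio.
    return min(ABSTRACT_BETS, key=lambda name: abs(ABSTRACT_BETS[name] - ratio))


print(translate_bet(130, pot=100, stack=1000))   # pot
print(translate_bet(60, pot=100, stack=1000))    # half_pot
print(translate_bet(1000, pot=100, stack=1000))  # all_in
```

Once bets are discretised this way, recorded expert decisions can be compared across hands with a similarity metric, which is what makes case-based generalisation possible.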